910 DMSs was the main contributor to each of 1,250 DEGs.
In other words, the aim was to investigate whether the top-ranked DMS in a
regression model for a DEG was a local or a remote methylation site.
Based on this information, a statistical analysis was carried out to
investigate the trend of the methylation-to-expression interplay pattern,
that is, whether the interplay was local or remote.
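The local-versus-remote labelling described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the cutoff that separates local from remote sites, so `LOCAL_WINDOW_BP`, the `classify` helper and the coordinates are all hypothetical.

```python
from collections import Counter

# Hypothetical cutoff (in base pairs); the text does not state the actual one.
LOCAL_WINDOW_BP = 1_000_000


def classify(dms, deg, window=LOCAL_WINDOW_BP):
    """Label a (DMS, DEG) pair as local or remote.

    dms and deg are (chromosome, position) tuples. Returns (label, distance),
    where distance is None when the two lie on different chromosomes.
    """
    dms_chrom, dms_pos = dms
    deg_chrom, deg_pos = deg
    if dms_chrom != deg_chrom:
        return "remote", None
    distance = abs(dms_pos - deg_pos)
    return ("local" if distance <= window else "remote"), distance


# Made-up (top-ranked DMS, DEG) coordinate pairs, for illustration only.
top_pairs = [
    (("chr1", 1_200_000), ("chr1", 1_250_000)),
    (("chr1", 5_000_000), ("chr1", 90_000_000)),
    (("chr7", 300_000), ("chr2", 400_000)),
]

# Tally how often the top-ranked DMS is local versus remote.
counts = Counter(classify(dms, deg)[0] for dms, deg in top_pairs)
print(counts)
```

Tallying the labels over all DEGs gives the raw counts on which a trend analysis of local versus remote interplay could be based.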
Let x denote the differential expression vector of
a DEG, which was the target (dependent) variable of a regression
model. In such a model, the 910 DMSs were treated as the regressors or the
independent variables. Moreover, w was used to represent a vector of
regression coefficients or model parameters for the 910 regressors, and M was
used to represent a matrix of the differential methylation ratios of the 910
DMSs. The M matrix has 114 rows and 910 columns. The regression
model for the gth DEG was defined as below,
$$\mathbf{x} = f(\mathbf{M}, \mathbf{w})$$
In such a regression model, f was designed as a regression function,
which was either linear or nonlinear. In a constrained (such as RLR) linear
regression model, the above equation was simplified as below,
$$\mathbf{x} = \mathbf{M}\mathbf{w} + \lambda \mathbf{w}^{t}\mathbf{w}$$
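As a rough illustration of the constrained linear model, the sketch below fits one such model for a single DEG. It assumes that the penalty term corresponds to a ridge-style (L2) penalty, so that w minimises the squared residual of x against Mw plus lambda times w^t w. The data are synthetic, and the value of `lam`, the helper names and the index of the dominant DMS are hypothetical.

```python
import numpy as np


def fit_ridge(M, x, lam):
    """Closed-form minimiser of ||x - M w||^2 + lam * w^T w."""
    n_features = M.shape[1]
    A = M.T @ M + lam * np.eye(n_features)
    return np.linalg.solve(A, M.T @ x)


def r_squared(x, x_hat):
    """Coefficient of determination of the fitted values x_hat."""
    ss_res = np.sum((x - x_hat) ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (not the study's data): 114 rows, 910 DMS columns.
rng = np.random.default_rng(0)
n_samples, n_dms = 114, 910
M = rng.normal(size=(n_samples, n_dms))
true_w = np.zeros(n_dms)
true_w[42] = 3.0  # hypothetical dominant DMS
x = M @ true_w + 0.1 * rng.normal(size=n_samples)

w = fit_ridge(M, x, lam=1.0)          # lam is an illustrative choice
ranking = np.argsort(-np.abs(w))      # DMS indices, largest |coefficient| first
top_dms = int(ranking[0])
r2 = r_squared(x, M @ w)
print(f"top-ranked DMS index: {top_dms}, in-sample R^2: {r2:.4f}")
```

Because there are more regressors (910) than rows (114), M^t M alone is singular; the penalty term is what makes the linear system solvable in this sketch.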
There were 1,250 such regression models for the 1,250 DEGs. These
models were designed to investigate how DMSs, which were from either
local or remote methylation sites, contributed to the differential expression
(pattern) of the gth DEG. Only the top-ranked DMSs were used for
the analysis in this chapter. If a remote DMS was ranked at the top,
the distance between this remote DMS and the gth DEG was then
calculated. The Lasso, RLR, SVM and random forest were used to
model the relationship between the variables (between the gth DEG and all
DMSs) and to rank the variables. Figure 4.29 shows the R-squared
measurements for the four models. It can be seen that all four models fitted
the data well.
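The four-model comparison for one DEG might look like the following scikit-learn sketch. It is not the study's pipeline: the data are synthetic, the hyperparameters are illustrative, RLR is assumed to be a ridge-style model, the SVM is assumed to be a linear support vector regressor, and the `importance` helper is a hypothetical way of ranking variables.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.svm import SVR

# Synthetic stand-in data (not the study's data): 114 rows, 910 DMS columns.
rng = np.random.default_rng(0)
M = rng.normal(size=(114, 910))
x = 3.0 * M[:, 42] + 0.1 * rng.normal(size=114)  # DMS 42 drives this made-up DEG

# Hyperparameters are illustrative assumptions, not the study's settings.
models = {
    "Lasso": Lasso(alpha=0.1),
    "RLR (assumed ridge)": Ridge(alpha=1.0),
    "SVM (linear SVR)": SVR(kernel="linear", C=1.0),
    "Random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}


def importance(model):
    """Per-DMS score: |coefficient| for linear models, impurity importance for the forest."""
    if hasattr(model, "feature_importances_"):
        return model.feature_importances_
    return np.abs(np.ravel(model.coef_))


results = {}
for name, model in models.items():
    model.fit(M, x)
    r2 = model.score(M, x)                    # in-sample R-squared
    top = int(np.argmax(importance(model)))   # index of the top-ranked DMS
    results[name] = (r2, top)
    print(f"{name}: R^2 = {r2:.3f}, top-ranked DMS index = {top}")
```

Repeating this loop for each DEG would give one top-ranked DMS per model and per DEG, which can then be labelled as local or remote.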